# Multimodal Video QA

VideoLLaMA2 7B 16F Base

VideoLLaMA 2 is a multimodal large language model focused on enhancing spatio-temporal modeling and audio understanding in video comprehension.

Tags: Transformers, English
